Fluctuations in models of biological macroevolution 
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Fluctuations in diversity and extinction sizes are discussed and compared for two different, 
individual-based models of biological coevolution. Both models display power-law distributions 
for various quantities of evolutionary interest, such as the lifetimes of individual species, the quiet 
periods between evolutionary upheavals larger than a given cutoff, and the sizes of extinction events. 
■ Time series of the diversity and measures of the size of extinctions give rise to flicker noise. Surpris- 

' ingly, the power-law behaviors of the probability densities of quiet periods in the two models differ, 

, while the distributions of the lifetimes of individual species are the same. 

<N ; 

^ ! I. INTRODUCTION 

Biological evolution presents many problems concerning highly nonlinear, nonequilibrium systems of interacting 
| entities that are well suited for study by methods from statistical mechanics. An excellent review of this rapidly 
growing field is found in Ref. Q. Among these problems are those concerned with coevolution: the simultaneous 
evolution of many species, whose mutual interactions produce a constantly changing fitness landscape. Throughout 
this process, new species are created and old species go extinct. A question that is still debated in evolutionary 
biology is whether evolution on the macroscale proceeds gradually or in 'fits and starts,' as suggested by Eldredge 
and Gould^*^ In the latter mode, known as punctuated equilibria, species and communities appear to remain largely 
unchanged for long periods of time, interrupted by brief (on a geological timescale) periods of mass extinctions and 
I 1 rapid change. 

A coevolution process involves a large range of timescales, from the ecologically relevant scales of a few generations, 
to geological scales of millions or billions of generations. Traditionally, models of macroevolution have been constructed 
t-H . on a highly coarse-grained timescale. (The best-known such model to physicists is probably the Bak-Sneppen model£) 
^ • However, the long-time dynamics of the evolution is clearly driven by ecological processes, mutations, and selection at 
comparatively short timescales. As a result, in recent years several new models have been proposed that are designed 
to span the disparate ecological and evolutionary timescales. These models include the Webworld modelfSi^ the 
Tangled-nature model^SiiS and simplified versions of the latter piLiLH In this paper I discuss and compare some of 
the properties of fluctuations in two simplified coevolution models: the model introduced in Ref. [Il| and a different 
' model that I am currently developing^ 

The rest of this paper is organized as follows. In Sec. [H] I introduce the two models, in Sec. 1 1 1 1 1 1 compare and 
discuss numerical results from large-scale Monte Carlo simulations for the different models, and in Sec. IIVI I present 
a summary and conclusions. 



II. MODELS 



Both of the models studied here consider haploid species whose genome is represented by a bit string of length L, 
so that there are a total of 2 L potential species, labeled by an index / £ [0, 2 L — 1]. At the end of each generation, an 
individual of species / can with probability Pj give rise to F offspring and then die, or with probability (1 — Pi) it 
dies without offspring. The fecundity F is assumed fixed and will be chosen appropriately for each model as discussed 
below. During its birth, an offspring individual may undergo mutation with a small probability fi/L per bit in its 
genome. During a mutation event, a bit in the genome is flipped, and the mutated offspring individual is considered 
as belonging to a different species than its parent. 

The reproduction probability Pi(t) of species / in generation t depends on the interactions of individuals of that 
species with other species, J, that are present in the community with nonzero populations nj, as well as possibly on 
its ability to utilize an external resource, R. It is given by the convenient nonlinear form, 

Pl{t) = l + eM*i(RAnj})} ' (1) 

which resembles the acceptance probability in a Monte Carlo simulation using Glauber transition rates^ The function 
depends linearly on the set of all nonzero populations in generation t, {nj(t)}, and possibly also on the external 



resource R. When is large and negative, the reproduction probability is near unity, while large positive values lead 
to a low reproduction probability. The coupling to the nj is effected via the interaction matrix M, whose elements 
Mjj are chosen randomly (with restrictions discussed below for the individual models) from a uniform distribution 
on [—1, +1]. If Mjj > and Mjj < 0, species I is a predator and J its prey, or vice versa, while Mu and Mjj both 
positive denotes mutualism or symbiosis, and both negative denote competition. Once M is chosen at the beginning 
of a simulation, it remains fixed ("quenched randomness"). These interactions (together with other, model-specific 
parameters) determine whether or not a species will have a sufficient reproduction probability to be successful in a 
particular community of other species. Typically, only a small subset of species have nonzero populations at any one 
time, forming a community. 
Model A: 

In Model A, which was introduced and studied in $/ takes the form 

*/({"./(*)}) = - ]T M u nj(t)/N tot (t) + N tot (t)/N , (2) 
J 

where N to t(t) = J2j n j(t) ^ s * ne total population at t. In this model, the Verhulst factor No prevents the population 
from indefinite growth and can be seen as representing an environmental "carrying capacity. 'Us The interaction matrix 
has elements that are randomly distributed over [— 1, +1], except that Mu — to focus on the effects of interspecies 
interactions. Here, Model A will be simulated with the following parameter values: F — 4 (see below), Ao = 2000, 
and (U = 10~ 3 . 
Model B: 

The new Model B is a predator/prey model. It is somewhat more realistic than Model A in that the population is 
maintained by a subset of the species that can utilize the external resource R (primary producers, or autotrophs). 
Other species can maintain themselves only by predation on one or more of the producer species (consumers, or 
heterotrophs). These modifications lead to the restrictions that the off-diagonal part of M must be antisymmetric, 
and further that if 7 is a producer and J a consumer, then / must be the prey of J, so that Mjj < 0. To avoid 
runaway population growth, the diagonal elements are chosen randomly on [—1,0). Such negative self interactions 
are commonly used in population dynamics models^ The specific form of $/ in this model is 

$f(R, {nj(t)» = b T - R m /N tot (t) - £ M u n,(t)/N tot (t) , (3) 

j 

where bj > can be seen as a "cost of reproduction," and r\j > is the coupling to the external resource. The latter 
is positive for producers and zero for consumers. In this implementation of the model, both bi and the nonzero r\i 
are chosen uniformly from (0, 1] at the beginning of the simulation and kept fixed thereafter. Here, Model B will be 
simulated with the following parameter values: F = 2 (see below), R — 2000, and [i = 10~ 3 . Only 5% of the potential 
species are producers (i.e., have r\i > 0), and in order to obtain food webs with more realistic connectivity, 90% of 
the Mjj j Mjj pairs are randomly chosen to be zero, giving a connectancel&12£&21£2^ of 10%, while the elements of 
the nonzero pairs are chosen on [—1, +1] as described above. 

Both models are sufficiently simple that the populations and stabilities of fixed-point communities can be obtained 
analytically for zero mutation rate. These analytical results are extensively compared with simulations in Refs. ITHIT^. 
andhj For the purposes of the present study I just mention that the analytical stability results yield intervals for 
the fecundity F, inside which it is ensured that a fixed-point community in the absence of mutations is stable toward 
small deviations in all directions. The values of F used here are chosen to yield such stable fixed-point communities 
for both models. 

My focus in this paper is to compare the fluctuations and long-time correlations in the two models for quantities of 
significance in ecology and evolutionary biology. One such quantity is the diversity of a community. This quantity is 
defined in many ways in the literature, most simply as the total number of species that exist with nonzero populations 
at a given time (also known as the species richness).? 4 Here I define it as D(t) = exp [S ({ni(t)})], where S is the 
information-theoretical entropji^*2&, 

S({nj(t)}) = - Pi(t)^Pi(t) (4) 

{I\pi(t)>Q} 

with pi(t) — nj(t)/N to t(t). This definition of diversity has the advantage that it is weighted away from species with 
very small populations, which in these models are likely to be short-lived, unsuccessful mutants. In ecology it is 
known as the Shannon- Wiener index^ It is expected to exhibit a low noise level during periods with little apparent 
evolutionary turnover (periods of stasis^^), while it should fluctuate much more strongly during highly active periods, 
such as mass extinctions or the emergence of new major species. Other quantities that can be monitored, and are 
expected to give similar information, are the number of species that go extinct in a single generation, either by itself, 
or weighted by the maximum population attained by the species that go extinct. 
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FIG. 1: (a): Typical diversity time series of T — 2 25 = 33 5 5 4 4 32 generations from Model A (black) and Model B (gray, 
green online). To facilitate printing, the data are sampled only every 2 15 = 32 768 generations, corresponding roughly to the 
graphical resolution of the figure. While this somewhat reduces the apparent range of the fluctuations, the general shape of 
the time series is preserved, (b): Histograms representing the probability density of the logarithmic derivative of the diversity, 
dS(t)/dt. The data were averaged over 16 generations in each run, and then averaged over 16 independent runs for Model A 
(solid, marked A) and 12 runs for Model B (dashed, marked B). The central parts of both histograms are well fitted by the 
same Gaussian distribution (dotted, marked G). 
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FIG. 2: Graphic representation of the community structure vs time, showing only species with populations between 101 and 
1000 (gray, green online) and above 1000 (black). See discussion in the text, (a): Model A. Data for the same simulation run 
illustrated in Fig. Q^a). (b): Model B. Data for the same simulation run illustrated in Fig. ^a). For ease of plotting, only 
every other data point is shown. 



III. NUMERICAL RESULTS AND ANALYSIS 



Both models were simulated for T = 2 25 = 33 5 5 4 4 32 generations, and several of their statistical properties were 
compared. Figure^a) contains time series for representative simulation runs for both models. The diversity of Model 
A is significantly less than for Model B, but both models show periods during which the relative fluctuations are 
modest (quiet periods) , separated by shorter periods during which the diversity changes strongly. Histograms of the 
logarithmic derivative of the diversity, dS(t)/dt, are shown in Fig. db). In both models the probability densities 
have Gaussian central peaks of very nearly the same width, with "heavy wings." While the Gaussian peak represents 
fluctuations due to the stochastic population dynamics^ and the creation and extinction of small populations of 
unsuccessful mutants, the wings represent large changes in the composition of the communities. This can be seen 
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FIG. 3: (a): Log-log plot of histograms representing the probability density of the durations of quiet periods, estimated as 
the periods between times when y = \dS(t)/dt\ (averaged over 16 generations) exceeds a cutoff y c . For Model A (solid) and 
Model B (dashed). Results for Model A were averaged over 16 runs, and for Model B over 12 runs. The error bars are standard 
errors, based on the spread between runs. The two dot-dashed straight lines represent 1/t 2 and 1/t power laws, respectively, 
(b): Log-log plot of histograms representing the probability density of the lifetimes of individual species for Model A (solid) 
and Model B (dashed). Results for Model A were averaged over 8 runs, and for Model B over 12 runs. The error bars were 
obtained as in part (a). The dot-dashed straight line represents a 1/t 2 power law. 



clearly by comparing the timing of the diversity fluctuations in Fig. ^ with Fig. [21 which shows the populations of 
individual species versus time for the same simulation runs shown in Fig.^ Quiet periods can be identified as periods 
between times when \dS(t)/dt\ exceeds a given threshold. However, the durations of the quiet periods so defined clearly 
depend on the magnitude of the threshold (discussed further below) and the functional form of the wings. Histograms 
representing the probability densities for the durations of quiet periods for both models are shown together in Fig.^a). 
Both histograms display approximate power-law behavior over at least five decades, but, somewhat surprisingly, the 
powers are significantly different: approximately 1/time 2 for Model A and 1/time for Model B. In contrast, the 
probability densities of the lifetimes of individual species (the time from the population of a species becomes nonzero 
until it again vanishes) are very close for the two models, both closely following a 1/time 2 power law over more than 
six decades (see Fig.EJb)). I believe this discrepancy between the distributions of the durations of quiet periods and 
the lifetimes of individual species is related to the fact that extinction events are much more highly synchronized in 
Model A than in Model B. This can be clearly seen from the time evolution of the community structure of the two 
models, shown in Fig. EI I n Model A all the major species in a community usually go extinct within a relatively short 
period of time, linking the duration of a quiet period closely to the lifetimes of its major constituent species. As a 
result, both the lifetime and quiet-period distributions for Model A show approximate 1/time 2 behavior. Conversely, 
extinctions in Model B tend to trigger further extinctions of only a part of the total food web that describes the 
community, producing a much weaker synchronization between individual extinctions and the duration of relatively 
quiet periods. This ability of the food web to avoid global collapse triggered by the extinction of a single species may 
be the cause of the wide 1/time distribution of quiet periods in this model. However, the 1/time distribution does not 
appear to be a direct consequence of the low connectance of the interactions for this model: tests of Model B with a 
connectance of 100% and 10% potential producers also resulted in a duration distribution with approximate 1/time 
decay. 

The time series for \dS(t)/dt\ is highly intermittent, and while the quiet periods are power-law distributed in both 
models, individual active periods, during which the activity remains continuously above the cutoff threshold, are 
generally exponentially distributed with a mean of just a few generations, except for very small thresholds. This was 
discussed for Model A in Ref. ITTl and I find the situation to be the same for Model B. 

An analysis method that is less dependent on the details of the distribution of the large fluctuations and the choice of 
cutoff, is the power spectral density (PSD) of a time series of a suitable global activity measure. It may be reasonable 
to consider both models as examples of extremal dynamics, in which the least "stable" parts of the system "collapse," 
leading to avalanches of readjustments throughout the system. It is known for such systems 2 ^ that if the probability 
density of the times between these local events has the long-time power-law dependence ~ l/i Tl with t\ < 2, then 
the PSD of the global activity will depend on frequency as l// a , where a = t\ — 1 (with a hard-to detect logarithmic 
correction if t\ = 2)i2& In the systems considered here, it seems reasonable to identify these local waiting times with 
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FIG. 4: (a): Log-log plot of octave-averaged power spectral densities (PSD) for the diversity D(t) for Model A (solid) 
and Model B (dashed). The straight, dot-dashed line is a guide to the eye, corresponding to 1// behavior. The data were 
averaged over 16 independent runs for Model A and 12 runs for Model B, and the error bars are standard errors based on 
the differences between independent runs, (b): Log- log plot of octave-averaged PSDs for the number of species going extinct 
in one generation (dashed) and for the number of species going extinct, weighted by the maximum population attained by 
each species, or extinction size (solid). Both PSDs refer to Model B and are averages over 12 independent runs. The straight, 
dot-dashed line corresponds to 1// behavior, and error bars were calculated as in part (a). 



the lifetimes of individual species, such that ri « 2 for both models. This reasoning then predicts that the PSD of 
the diversity in both cases should exhibit practically pure l/f noise (i.e., a ~ 1). Indeed, the PSDs for both models, 
which are shown in Fig. Ufa), exhibit l/f like behavior over many decades. Except for a difference in their overall 
magnitudes, the two PSDs behave essentially alike for frequencies between 10 -6 and 10 -2 generations -1 . For the 
very lowest frequencies, the PSD for Model A appears to level off, but the amount of data in this regime is small, and 
much longer simulations would be needed to accurately determine the PSDs below 10 -6 generations -1 . Power-law 
distributions and PSDs that fall off as 1/ f a with a w 1 have been obtained from the fossil record^ However, such 
power-law behaviors have only been observed over one or two decades in time or frequency, and there are arguments 
to the effect that l/f spectra extracted in Ref . 12^ mav be severely influenced by the analysis method^ 

Another quantity that can be used to determine the severity of an extinction event is the total number of species 
that go extinct in a single generation. However, this quantity is heavily influenced by the repeated extinctions of 
small populations of unsuccessful mutants of species with large populations. A better measure might therefore be the 
number of species that go extinct, weighted by the maximum population attained by each species during its lifetime. 
For convenience I shall call this latter measure the extinction size. It can be seen as a rough measure of the maximum 
fitness attained by the species that go extinct in a particular generation. PSDs of both quantities for Model B are 
shown in Fig. 01b). Like the diversity PSDs, both these PSDs show l/f like behavior for low frequencies. However, 
they also have a wide region of near- white noise for higher frequencies. 

In models governed by extremal dynamics, the exponents a for the frequency dependence of the PSD, t\ for the 
duration of local events (here interpreted as the lifetimes of individual species), and r for the duration of quiet 
periods where the activity measure (here \dS(t)/dt\) falls below some cutoff y c , are connected by the following scaling 
relations >21 

a = 1 — /j, 
n = 2 - n 

t = 1 + fi — a , (5) 

where \x and a are two independent exponents that depend on the spatial dimensionality d of the underlying model 
and both vanish in the limit d — > O^i One thus notes that (assuming that my identification of the quantities measured 
in the models studied here with quantities in the extremal-dynamics systems is valid) the exponents a — 1, t\ =2, 
and t = 1, observed for Model B, coincide with those of a zero-dimensional extremal-dynamics system. On the other 
hand, Model A does not seem to fit into this pattern. It remains a question for further study to decide how appropriate 
this analysis is. 

As seen in Fig. I^a), the probability density of the duration s of quiet periods in Model B is well described as 
p(s) ~ s -T with t ~ 1 , multiplied by a function that approaches zero for large s and a constant for small s. 
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FIG. 5: (a): Log-log plots of histograms representing the probability density p(s,y c ) of the duration s of quiet periods for 
Model B, as obtained with different values of the cutoff y c . The dot-dashed straight line corresponds to 1/s power-law behavior, 
(b): Log-log plot of the scaling function f(x) = sp(s,y c ) vs the scaling variable x = s/(s\y c ). 

Figure EJa) shows that this function depends on the cutoff y c , and it is naturally expressed as a scaling function, 

P{s,Vc) = s~ T f(s/g(y c )) . (6) 

The mean value of s as calculated from the numerical p(s,y c ), (slyc), would seem a natural choice for g{y c ), as long 
as it is much less than the total time T of the simulation. Thus, the argument of the scaling function would be 
x = s/(s\y c ). In the relatively narrow range of the four values of y c used in Fig. |SJa), (s\y c ) is well approximated by 
the power law (s\y c ) — 27948 (y c /0.008) 7 with 7 w 7.84, and the scaling function, 

f(x) = s T p(s,y c ) = x T (s|y c ) r_1 (s\y c )p(x(s\y c ),y c ) = x T {s\y c ) T ~ 1 p(x) . (7) 

The resulting f(x) is shown in Fig. |S{b) for r = 1. The data collapse is reasonable, but it is not very sensitive to the 
value of 7, which can be reduced to near 5 without giving a significantly inferior collapse. 

IV. CONCLUSION 

In this paper I have discussed the fluctuations in two different, simplified models of biological macroevolution 
that both are based on the birth/death behavior of single individuals on the ecological timescale. On evolutionary 
timescales both models give rise to power-law distributions of characteristic waiting times and power spectra that 
exhibit 1// like flicker noise, in agreement with some interpretations of the fossil recordi^i^ii The main difference 
in the construction of the models is that in Model A the population size is controlled by a Verhulst factor, while in 
Model B the population is maintained by a small percentage of the species that have the ability to directly utilize an 
external resource (producers or autotrophs). All other species must maintain themselves as predators (consumers or 
heterotrophs). In both models the probability density of the lifetimes of individual species follows a 1/time 2 power 
law and, consistently, the power spectra of the time series of diversity and extinction sizes show 1// noise. However, 
the probability density of quiet periods, defined as the times between events when the magnitude of the logarithmic 
derivative of the diversity exceeds a given cutoff, behaves differently in the two models. In Model B, the quiet-period 
distribution goes as 1/time, consistent with the universality class of the zero-dimensional Bak-Sneppen model. In 
contrast, in Model A the behavior is proportional to 1/time 2 , like the lifetimes of individual species. The difference 
between the behaviors can be linked to the lower degree of synchronization between extinction events in Model B. It 
remains a topic for further research to see whether this difference extends to more realistic modifications of Model B. 
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